WISE 2014 Challenge: Multi-label Classification of Print Media Articles to Topics
Abstract
The WISE 2014 challenge was concerned with the task of multi-label classification of articles coming from Greek print media. Raw data comes from the scanning of print media, article segmentation, and optical character segmentation, and therefore is quite noisy. Each article is examined by a human annotator and categorized to one or more of the topics being monitored. Topics range from specific persons, products, and companies that can be easily categorized based on keywords, to more general semantic concepts, such as environment or economy. Building multi-label classifiers for the automated annotation of articles into topics can support the work of human annotators by suggesting a list of all topics by order of relevance, or even automate the annotation process for media and/or categories that are easier to predict. This saves valuable time and allows a media monitoring company to expand the portfolio of media being monitored. This paper summarizes the approaches of the top 4 among the 121 teams that participated in the competition.
Similar resources
Exploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
Multi-label Classification Using Hypergraph Orthonormalized Partial Least Squares
In many real-world applications, human-generated data like images are often associated with several semantic topics simultaneously, called multi-label data, which poses a great challenge for classification in such scenarios. Since the topics are always not independent, it is very useful to respect the correlations among different topics for performing better classification on multi-label data. H...
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
Content Analysis of Media Coverage of Childhood Obesity Topics in UAE Newspapers and Popular Social Media Platforms, 2014-2017
The 2017 prevalence of obesity among children (age 5–17 years) in the United Arab Emirates (UAE) is 13.68%. Childhood obesity is one of the 10 top health priorities in the UAE. This study examines the quality, frequency, sources, scope and framing of childhood obesity in popular social media and three leading UAE newspapers from 2014 to 2017. During the review period, 152 newspaper articles fro...
SAPKOS: Experimental Czech Multi-label Document Classification and Analysis System
This paper presents an experimental multi-label document classification and analysis system called SAPKOS. The system, which integrates state-of-the-art machine learning and natural language processing approaches, is intended to be used by the Czech News Agency (ČTK). Its main purpose is to save human resources in the task of annotation of newspaper articles with topics. Another important fun...
Publication date: 2014